State-wide data sharing system for a thin layer of person, address, relationship, event and participation data
A new government project team was created and mandated to collect and share data within the state to benefit the vulnerable cohort represented in siloed datasets.
Factil and Information Management Systems was contracted to create business information models (BIMs) and outline how data matching might be implemented across these disparate datasets. The database systems were of various ages, types and sizes. Many of the datasets included missing, incomplete, low-quality and duplicated data.
The business information model was used to create mappings from the source systems to the centralised Kalinda system. Extract, transform and load procedures were then performed on these datasets to collate a thin layer of person, location, relationship, event and participation information.
The data was then loaded into the Kalinda system, which had been developed in a previous project. The performance and capabilities of the Kalinda system were significantly uplifted during this project. For example, the project was designed to consider only persons with last names starting with the letter ‘M’. However, the Kalinda system performed so well that all records (roughly 10 million) from 13 databases were loaded into the system. Record deduplication, data matching and entity resolution could be performed in overnight batch processes for both logistic regression and probabilistic matching algorithms.
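To illustrate the general idea behind probabilistic matching, here is a minimal Python sketch in the Fellegi-Sunter style. It is not the Kalinda implementation: the field names, the m/u probabilities and the function name match_weight are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | records refer to the same person)
#   u = P(field agrees | records refer to different people)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "date_of_birth": (0.97, 0.003),
}


def match_weight(record_a, record_b):
    """Sum the log2 likelihood ratios over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a.strip().lower() == b.strip().lower():
            total += math.log2(m / u)  # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


rec_a = {"last_name": "Morgan", "first_name": "Ann", "date_of_birth": "1980-01-02"}
rec_b = {"last_name": "morgan ", "first_name": "Ann", "date_of_birth": "1980-01-02"}
rec_c = {"last_name": "Miller", "first_name": "Bob", "date_of_birth": "1975-05-06"}

print(match_weight(rec_a, rec_b))  # positive: likely the same person
print(match_weight(rec_a, rec_c))  # negative: likely different people
```

In practice a pair would be classed as a match, a non-match or a candidate for review by comparing its weight against thresholds, and the m and u values would be estimated from the data rather than fixed by hand.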
Significant effort went into creating a system to assess the quality and usefulness of different matching techniques and to display the results.
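One common way to compare matching techniques is to score the record pairs each technique links against a set of known true matches. The Python sketch below assumes such a labelled set exists; the function name pair_metrics and the sample identifiers are hypothetical and do not describe the assessment system built for this project.

```python
def pair_metrics(predicted_pairs, true_pairs):
    """Return precision, recall and F1 for a set of predicted matches.

    Pairs are treated as unordered, so (1, 2) and (2, 1) are the same link.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    truth = {frozenset(p) for p in true_pairs}
    true_positives = len(predicted & truth)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical record identifiers, for illustration only.
known_matches = [(2, 1), (5, 6)]
technique_output = [(1, 2), (3, 4)]

print(pair_metrics(technique_output, known_matches))
```

Running the same function over the output of each technique gives figures that can be compared side by side.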
These performance and feature enhancements continue today with the short-term goal of developing an on-premises or cloud-based system that can enhance and assess MDM solutions from the main players in the industry.
The project was delivered on time and on budget.